-
- THIS CODE COMPILES AND RUNS ON IRIX 5.1 OR LATER.
- IT WILL NOT RUN ON IRIX RELEASES EARLIER THAN 5.1.
-
- THIS DEMO IS BUILT WITH PRE-RELEASED DIGITAL MEDIA LIBRARY CODE.
- THE API AND FUNCTIONALITY ARE SUBJECT TO CHANGE. THE FINAL
- RELEASED VERSIONS OF THE DIGITAL MEDIA LIBRARIES WILL BE
- AVAILABLE AT THE END OF THIS YEAR BY ORDERING THE DIGITAL MEDIA
- LIBRARY DEVELOPMENT OPTION, "SC4-DEMDEV-1.2".
-
- THE SPEECH RECOGNITION DEVELOPER TOOLKIT IS AVAILABLE FROM:
- SCOTT INSTRUMENTS CORP.
- 1111 WILLOW SPRINGS DRIVE
- DENTON, TX 76205
- TEL (817) 387-9514
- FAX (817) 566-3174
-
- ______________________________________________________________________________
-
-
- ~4Dgifts/toolbox/src/exampleCode/speech/lackey README
-
-
- mags 04.04.94
-
- lackey
-
- This is a speech recognition application example. It recognizes
- speech through the use of a speech recognition library. The
- example uses speech to launch desktop applications. Lackey has an
- internal list of "words" (words to be recognized) and their
- corresponding applications to be launched. Saying one of these
- words causes the execution of that application. The current set
- of commands recognized by lackey is clock, shell, apanel, and
- lackey; this set can easily be extended.
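-
- As a rough illustration of the word-to-application list described above,
- the following C++ sketch keeps the vocabulary in a lookup table. It does
- not use the speech API; the names Vocabulary, defaultVocabulary,
- commandFor, addWord, and launch are hypothetical, and the command strings
- (for example xclock for "clock" and winterm for "shell") are assumptions,
- not necessarily what lackey actually runs.
-
```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical vocabulary table: spoken word -> command line to launch.
using Vocabulary = std::map<std::string, std::string>;

// The four words come from the README; the commands are assumed placeholders.
Vocabulary defaultVocabulary() {
    return {
        {"clock",  "xclock"},
        {"shell",  "winterm"},
        {"apanel", "apanel"},
        {"lackey", "lackey"},
    };
}

// Return the command for a recognized word, or an empty string when the
// word is not in the vocabulary.
std::string commandFor(const Vocabulary& vocab, const std::string& word) {
    auto it = vocab.find(word);
    return it == vocab.end() ? std::string() : it->second;
}

// Extend (or overwrite) the vocabulary with one more word.
void addWord(Vocabulary& vocab, const std::string& word,
             const std::string& command) {
    assert(!word.empty());
    vocab[word] = command;
}

// Start the command in the background through the shell and return the
// value reported by std::system.
int launch(const std::string& command) {
    return std::system((command + " &").c_str());
}
```
-
- In such a design, extending the vocabulary amounts to one more table
- entry, for example addWord(vocab, "editor", "jot"), after which the new
- word still has to be trained.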
-
- An audio capable system (Indigo, Indigo2, Indy) and a microphone are
- required.
-
- This example program was developed using SGI's speech recognition
- software. It serves as an introduction to the speech C++ API and to
- the architecture of the speech software system.
-
- INSTALLATION:
-
- Before you can begin using this application you must update your
- system software to include the speech components. Included on this
- edition of the Developers Toolbox are the inst modules for the speech
- execution and developer subsets. When installed, these modules update your
- system with the speech DSO for your Xserver, the speech templates for
- existing applications (such as Showcase), sounds and images, the speech
- client library, include files, and sample programs.
-
- The speech server is part of the Xserver. Therefore, to activate
- speech recognition, restart your Xserver by logging out of
- your current session.
-
- After you have logged back in, start the
- Speech Recognition panel:
-
- % srpanel &
-
-
- STARTING lackey:
-
- Be sure to set your apanel settings as follows:
- input sampling rate to 8 kHz, input device to microphone, and
- input level to 10.
-
- Now you can start lackey:
-
- % lackey &
-
-
- TRAINING THE WORDS:
-
- None of the words that lackey reacts to are known to the speech system,
- so you must train the recognizer to recognize them. Because this is
- a speaker-independent system, the more people who train the
- words, the better the recognizer becomes at handling variation
- among the speakers who use the system. Since you have
- never trained the words to be recognized by the lackey program,
- the speech recognition panel will prompt you to train each of
- these new words one at a time. This is indicated by a picture
- of a cat in the image window of the srpanel. Make sure the
- microphone is at least 12 inches away from you. Slowly repeat the
- word (displayed in the prompt window of srpanel) in a normal tone until the
- word is recognized (this takes a minimum of 4 samples). You will see
- (1/4) near the word being trained, which indicates that one of four
- valid samples has been collected. Keep repeating the word until all four
- samples have been collected. Repeat this process for the rest of the
- words in lackey's vocabulary. If this process somehow aborts or fails,
- you can use the Customization panel found in the srpanel's
- Recognizer pulldown menu.
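-
- The (1/4) counter described above can be pictured with a small C++
- sketch. This is only an illustration of the bookkeeping, not the
- srpanel's code: the class TrainingProgress and its members are
- hypothetical, and how the recognizer decides that a sample is valid
- is not modeled here.
-
```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical bookkeeping for training one word: counts valid samples
// until the required number (4 in the README) has been collected.
class TrainingProgress {
public:
    explicit TrainingProgress(std::string word, int required = 4)
        : word_(std::move(word)), required_(required) {
        assert(required > 0);
    }

    // Record one utterance. Invalid samples are ignored, and the count
    // never exceeds the required number.
    void addSample(bool valid) {
        if (valid && collected_ < required_) {
            ++collected_;
        }
    }

    // True once all required valid samples have been collected.
    bool done() const { return collected_ >= required_; }

    // Text such as "clock (1/4)" for display next to the word.
    std::string label() const {
        return word_ + " (" + std::to_string(collected_) + "/" +
               std::to_string(required_) + ")";
    }

private:
    std::string word_;
    int required_;
    int collected_ = 0;
};
```
-
- With this bookkeeping, a rejected utterance leaves the counter unchanged,
- which matches the instruction to keep repeating the word until all four
- samples have been collected.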
-
-
- Features of SGI's upcoming release of speech technology:
-
- - speaker-independent discrete-utterance recognition
- - quick response (less than 200 msecs)
- - medium-sized vocabularies (50 active words at a time)
- - no extra hardware required
- - server-based
- - supports multiple speech application clients
- - supports networked speech application clients
- - handles focus policies for speech application clients
- - dispatches recognition events
- - audio samples processed only once by central server
- (for computational efficiency)
- - pretrained words and phrases
- - a suite of selected applications that are "speech-aware"
- - CASE tools
- - Showcase
- - Desktop
- - a vocabulary development system that allows the user to add,
- modify, and delete the words which can be recognized
- - a control panel through which the user can set behavioral
- characteristics such as the acceptance and rejection thresholds
- - a tool to generate actions for applications that are not
- "speech aware" (not listening for speech input)
- - documentation
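-
- To make the acceptance and rejection thresholds mentioned in the list
- above more concrete, here is a minimal C++ sketch of one common way such
- thresholds can be applied. It is an assumption, not SGI's implementation:
- the score scale, the three-way outcome, and the names Decision,
- Thresholds, and classify are all hypothetical.
-
```cpp
#include <cassert>

// Hypothetical outcome of comparing a match score with two thresholds.
enum class Decision { Accept, Uncertain, Reject };

// Assumed convention: higher scores mean a better match, and the
// rejection threshold is never above the acceptance threshold.
struct Thresholds {
    double accept;
    double reject;
};

// Accept at or above the acceptance threshold, reject below the
// rejection threshold, and report anything in between as uncertain.
Decision classify(double score, const Thresholds& t) {
    assert(t.reject <= t.accept);
    if (score >= t.accept) {
        return Decision::Accept;
    }
    if (score < t.reject) {
        return Decision::Reject;
    }
    return Decision::Uncertain;
}
```
-
- Under these assumptions, raising the acceptance threshold makes the
- recognizer more conservative, while lowering the rejection threshold
- widens the uncertain band.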
-
- SGI's developer product for speech applications also features:
-
- - an API implemented with C and C++ speech headers & libraries
- - advanced vocabulary development tools
- - a database of pretrained words and phrases
- - documentation
- - programmers' guide
- - API specification
- - policy (style) guide for developing speech application
- behavior and vocabularies
-
-
-
-